Abstract. We report constraints on primordial non-Gaussianity from the abundance of X-ray
detected clusters. Our analytic prescription for adding non-Gaussianity to the cluster
mass function takes into account moments beyond the skewness, and we demonstrate that
those moments should not be ignored in most analyses of cluster data. We constrain the
amplitude of the skewness for two scenarios that have different overall levels of non-Gaussianity,
characterized by how amplitudes of higher cumulants scale with the skewness. We find that
current data can constrain these one-parameter non-Gaussian models at a useful level, but
are not sensitive to adding further details of the corresponding inflation scenarios. Combining
cluster data with Cosmic Microwave Background constraints on the cosmology and power
spectrum amplitude, we find the dimensionless skewness to be $10^3 \mathcal{M}_3 = -1^{+24}_{-28}$ for one of
our scaling scenarios, and $10^3 \mathcal{M}_3 = -4 \pm 7$ for the other. These are the first constraints
on non-Gaussianity from Large Scale Structure that can be usefully applied to any model
of primordial non-Gaussianity. The former constraint, when applied to the standard local
ansatz (where the n-th cumulant scales as $\mathcal{M}_n \propto \mathcal{M}_3^{n-2}$), corresponds to $f_{\rm NL}^{\rm local} = -3^{+78}_{-91}$.
When applied to a model with a local-shape bispectrum but higher cumulants that scale
as $\mathcal{M}_n \propto \mathcal{M}_3^{n/3}$ (the second scaling scenario), the amplitude of the local-shape bispectrum
is constrained to be $f_{\rm NL}^{\rm local} = -14^{+22}_{-21}$. For this second scaling (which occurs in various
well-motivated models of inflation), we also obtain strong constraints on the equilateral and
orthogonal shapes of the bispectrum, $f_{\rm NL}^{\rm equil} = -52^{+85}_{-79}$ and $f_{\rm NL}^{\rm orth} = 63^{+97}_{-104}$. This sensitivity
implies that cluster counts could be used to distinguish qualitatively different models for the
primordial fluctuations that have identical bispectra.
1 Introduction

The Large Scale Structure of the late Universe depends on a rich array of physics: the spec- 
trum of primordial curvature inhomogeneities, the cosmological evolution of the Universe, the 
rules governing the growth of structure, and particle physics, chemistry and thermodynamics 
within individual stars, galaxies, and galaxy clusters. Extracting details of the primordial 
fluctuations is necessarily a difficult problem, but fortunately there are several complemen- 
tary observables available to us. In this work, we use measurements of the mass and redshift 
distribution (the mass function) of a sample of galaxy clusters to constrain primordial non- 
Gaussianity. We demonstrate that this is a complementary probe to the Cosmic Microwave 
Background (CMB) bispectrum and the halo bias because it is sensitive to different aspects 
of the primordial non-Gaussianity. 

Most work on cosmological constraints from clusters has focused on dark energy [1-3].
Here we apply the substantial progress made on characterizing the mass-observable
relations in that context to study primordial non-Gaussianity. We use 237 X-ray bright 
clusters detected in the ROSAT All-Sky Survey [4] and the analysis techniques of Mantz et 
al. [5] to investigate two one-parameter models for the non-Gaussian curvature perturbations 
and four two-parameter models. The clusters in this sample have redshifts up to z = 0.5 
and masses of order $10^{14}$-$10^{15}\,M_\odot$. With the semi-analytic, non-Gaussian mass function
extended to include terms beyond the skewness, we find constraints that are completely 
consistent with Gaussian statistics for the primordial fluctuations. However, contrary to 
some of the expectations in the literature [6-8], we find error bars small enough to indicate 
that cluster counts can provide complementary information to current constraints from the 
CMB bispectrum [9, 10] and from the galaxy bias [11-14]. Our results are consistent with 
but tighter than those recently obtained using clusters detected by the South Pole Telescope 
(SPT) [15, 16] and clusters selected from the Sloan Digital Sky Survey (SDSS) [17]. 

Statistics of the primordial fluctuations beyond the homogeneous and isotropic power 
spectrum are an extremely important source of information about the very early Universe,
so much so that pursuing limits on non-Gaussianity down to the minimal levels expected
from single field slow-roll inflation is an important task. Galaxy clusters form from rare 
primordial over-densities, with more massive and higher redshift clusters tracing rarer initial 
fluctuations, and so their abundance is sensitive to non-Gaussianity in the primordial inho- 
mogeneities. Constraints from cluster number counts are complementary to other probes of 
non-Gaussianity in three ways: they probe smaller scales than the CMB or galaxy bias do 
(cluster constraints are at $k \sim 0.1$-$0.5\,h/{\rm Mpc}$), they are sensitive to any non-Gaussianity
(including any shape for the primordial bispectrum), and they are sensitive to higher order 
cumulants of the probability distribution of the primordial inhomogeneities. 
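
The link between cluster mass and rarity can be made concrete with a minimal sketch for the purely Gaussian case. It assumes the collapse threshold $\delta_c = 1.686$ quoted later in the text; the function names and the values of the rms fluctuation $\sigma$ are illustrative choices, not numbers from the analysis.

```python
import math

DELTA_C = 1.686  # spherical-collapse threshold quoted in Section 2.2


def peak_height(sigma):
    """nu_c = delta_c / sigma, where sigma is the rms smoothed density fluctuation."""
    return DELTA_C / sigma


def gaussian_tail_fraction(sigma):
    """Fraction of a Gaussian field lying above the collapse threshold."""
    nu = peak_height(sigma)
    return 0.5 * math.erfc(nu / math.sqrt(2.0))


# Illustrative sigma values only: smaller sigma mimics larger mass or higher redshift.
for sigma in (1.0, 0.6, 0.4):
    print(f"sigma={sigma:.1f}  nu_c={peak_height(sigma):.2f}  "
          f"tail={gaussian_tail_fraction(sigma):.2e}")
```

The tail fraction falls steeply as $\sigma$ decreases, which is why small changes to the shape of the tail of the distribution can change the abundance of the rarest objects appreciably.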

The non-Gaussian statistics of the primordial perturbations are completely described by
the set of correlation functions $\langle \Phi(\mathbf{k}_1)\Phi(\mathbf{k}_2)\cdots\Phi(\mathbf{k}_n)\rangle_c$, where $\Phi(\mathbf{k})$ is the Bardeen potential
in momentum space, and n runs from 3 to infinity. The subscript c stands for 'connected' and
picks out the parts of the correlations that vanish for a Gaussian field. Clearly, a single
parameter cannot describe this series of functions. However, if the field is weakly non-Gaussian,
the three-point function is likely to generate the strongest observational signal, so many
non-Gaussian models are classified by naming the configuration of the three-point function and
its amplitude. The Wilkinson Microwave Anisotropy Probe (WMAP) satellite, for example,
reports constraints on the parameters $f_{\rm NL}^{\rm local}$, $f_{\rm NL}^{\rm equil}$, and $f_{\rm NL}^{\rm orth}$, where the labels 'local',
'equilateral', and 'orthogonal' refer to specific, scale-independent functional forms assumed for the
three-point correlation. Translation invariance requires that the three-point correlation has a
factor $\delta_D^3(\mathbf{k}_1 + \mathbf{k}_2 + \mathbf{k}_3)$ (a Dirac delta function), similar to the term $\delta_D^3(\mathbf{k}_1 + \mathbf{k}_2)$ in the power
spectrum. The three-point correlation is then just a function of two independent momenta
and is called the bispectrum.

The local, equilateral and orthogonal bispectra are shown in Eq.(2.3) below. Interestingly,
though, object number counts are not sensitive to the details of the bispectrum's
momentum dependence. Instead, only the integrated moments of the smoothed density
fluctuations $\delta_R$ are relevant. For example,

$$\langle \delta_R^3 \rangle = \int\frac{d^3k_1}{(2\pi)^3}\int\frac{d^3k_2}{(2\pi)^3}\int\frac{d^3k_3}{(2\pi)^3}\, M(k_1,R,z)\,M(k_2,R,z)\,M(k_3,R,z)\,\langle\Phi(\mathbf{k}_1)\Phi(\mathbf{k}_2)\Phi(\mathbf{k}_3)\rangle_c \qquad (1.1)$$

where the terms $M(k_i,R,z)$ contain a window function, the corresponding factors from the
Poisson equation, the transfer function and the growth factor converting the linear perturbation
in the gravitational potential to the smoothed density perturbation. (The full expressions
for these quantities can be found in Appendix A.) We characterize the non-Gaussianity by
the dimensionless ratios of the cumulants of the density field

$$\mathcal{M}_{n,R} = \frac{\langle\delta_R^n\rangle_c}{\langle\delta_R^2\rangle^{n/2}} \qquad (1.2)$$

which are by construction redshift independent and nearly independent of the smoothing
scale, $R$, if the primordial bispectrum is scale independent.^1

When cumulants beyond the skewness (correlations beyond the bispectrum) are relevant,
a one-parameter model is only useful if we can use it to specify the amplitude of all the
correlations. In this paper we use $\mathcal{M}_3$ and a choice for how higher moments scale with $\mathcal{M}_3$
to describe non-Gaussian fluctuations. The scalings we consider are motivated by particle
physics models of inflation, and our constraints on the total dimensionless skewness can
always be re-written in terms of a particular bispectrum using Eq.(1.1) and Eq.(1.2).

^1 Scale independence means that the bispectrum (e.g., those in Eq.(2.3)) contains no length scale other
than the factors $k_0$ in the $P(k_i)$ terms.

Most previous work on the utility of cluster counts to constrain non-Gaussianity has 
focused on the local ansatz [18, 19], where one assumes that the non-Gaussian field $\Phi(x)$ is
a simple, local transformation of a Gaussian field $\Phi_G(x)$:

$$\Phi(x) = \Phi_G(x) + f_{\rm NL}^{\rm local}\left[\Phi_G(x)^2 - \langle\Phi_G(x)^2\rangle\right]. \qquad (1.3)$$

In this useful model, $f_{\rm NL}^{\rm local}$ is the single parameter that all correlation functions depend on, and
the cumulants scale as $(f_{\rm NL}^{\rm local})^{n-2}$. Non-Gaussianity of the local type has a bispectrum that
most strongly correlates Fourier modes of very different wavelengths. This particular mode 
coupling generates strong signals in other large scale structure observables - most notably 
introducing a scale dependence in the bias of dark matter halos, luminous red galaxies and 
quasars [11, 14, 20]. For this reason, papers that have analyzed the potential for future surveys 
to constrain non-Gaussianity have largely focused on non-Gaussianity captured by the local 
ansatz only, and on the superior constraints from the bias compared to number counts. The 
bias may allow us to probe $\Delta f_{\rm NL}^{\rm local} \sim \mathcal{O}(1)$ in the near future [21, 22], although this optimism
is still subject to a full understanding of the relevant systematics [14, 23, 24]. Regardless, the 
motivation for looking at cluster number counts to constrain a scale-invariant, local ansatz 
is certainly weak. However, single field models of inflation cannot produce large local non- 
Gaussianity over a wide range of scales [25]. Since non-Gaussianity that does not strongly 
correlate modes of very different wavelength is not particularly detectable in the galaxy bias 
[13, 26, 27], that measure alone tests only a subset of viable inflation models. Furthermore, 
number count constraints are sensitive to higher order correlation functions. As we will 
demonstrate here, number counts can distinguish between scenarios with indistinguishable 
bispectra that nonetheless are generated by qualitatively different inflationary physics. That 
is, a model may have a local-shape bispectrum but higher order moments that do not scale 
as those from the standard local ansatz do (e.g., [28]). In this case, number counts or other 
measurements sensitive to higher moments will provide complementary information to the 
halo bias constraints. For the rest of the paper, we will assume that $f_{\rm NL}^{\rm local}$ is a parameter
measured from the bispectrum alone (see Eq.(2.3) below), which does not imply the entire 
series of correlations that Eq.(1.3) generates, unless we specify the local ansatz as our model. 
This use of $f_{\rm NL}^{\rm local}$ is more in line with how it is observationally defined (e.g., in analyses of
the CMB and halo bias). 
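
As a hedged illustration of the local ansatz of Eq.(1.3), the following sketch draws one-point samples of a Gaussian variable, applies the quadratic transformation, and compares the measured dimensionless skewness with the leading-order expectation $6 f_{\rm NL} \sigma$ for a single Gaussian variable of rms $\sigma$. It ignores smoothing, transfer functions and spatial correlations entirely, and the values `f_nl = 0.05`, `sigma = 1.0` are chosen only to make the effect visible in a small Monte Carlo run; all function names are illustrative.

```python
import random


def local_ansatz_samples(f_nl, sigma, n, seed=0):
    """Draw n samples of Phi = g + f_nl * (g^2 - <g^2>) with g ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    var = sigma * sigma
    samples = []
    for _ in range(n):
        g = rng.gauss(0.0, sigma)
        samples.append(g + f_nl * (g * g - var))
    return samples


def dimensionless_skewness(samples):
    """Third central moment divided by the 3/2 power of the variance."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    return m3 / m2 ** 1.5


f_nl, sigma = 0.05, 1.0  # illustrative values, not fitted quantities
estimate = dimensionless_skewness(local_ansatz_samples(f_nl, sigma, 200_000))
leading_order = 6.0 * f_nl * sigma
print(f"Monte Carlo skewness: {estimate:.3f}, leading order 6*f_nl*sigma: {leading_order:.3f}")
```

The same one-point reasoning underlies the statement that every cumulant in the local ansatz is controlled by the single parameter, with the n-th dimensionless cumulant carrying $n-2$ powers of it.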

This paper provides the first constraint on primordial non-Gaussianity from X-ray detected
clusters, and the first large scale structure constraint that can be usefully applied to
any model for primordial non-Gaussianity. In the next Section, we define our parametriza- 
tion of the effects of primordial non-Gaussianity on cluster abundance, using a semi-analytic 
non-Gaussian mass function in terms of a single new parameter. In Section 3 we discuss sev- 
eral theoretically motivated extensions to two parameters. We present our results for the one 
and two parameter models in Section 4. In Section 5, we discuss how our results compare to 
previous results and forecasts in the literature, including the SPT constraints reported from 
two small samples of Sunyaev-Zel'dovich (SZ) detected clusters [15, 16] and those from a 
large sample of optically selected clusters [17], the SDSS maxBCG cluster catalogue [29]. We 
summarize in Section 6. Appendix A contains the details of our semi-analytic mass function 
prescriptions as well as others that exist in the literature.^2 Quoted error intervals are always
68.3% confidence, unless otherwise specified.

^2 We have shared some computer code helpful for evaluating our mass function; see Appendix A.



2 The effect of Primordial non-Gaussianity on object number counts 

Our basic tool is a series expansion for the ratio of the non-Gaussian mass function to 
the Gaussian one. The expansion we use is based on a Press-Schechter model for halo 
formation applied to non-Gaussian probability distributions for the primordial fluctuations. 
A detailed derivation of the non-Gaussian mass function we use is given in Appendix A 
and was developed in [30-32]. The weakly non-Gaussian probability distributions that the 
mass function is based on are asymptotic expansions that deviate substantially from the 
actual probability density function (PDF) for sufficiently rare fluctuations. Fortunately, our 
cosmology is already sufficiently constrained to determine that the clusters in our sample 
do not lie in that regime. However, the clusters are sufficiently rare that truncating the 
expansion below at a single term (the skewness) is not sufficient to test the full range of 
models that are only as skewed as current CMB constraints allow. 

We add non-Gaussianity to the cosmology by considering a mass function of the form 



$$\left(\frac{dn}{dM}\right)_{\rm NG} = \left(\frac{dn}{dM}\right)_{\rm Tinker}\,\left.\frac{n_{\rm NG}}{n_{\rm G}}\right|_{\rm Edgeworth} \qquad (2.1)$$

where the first term on the right hand side is the Gaussian mass function of Tinker et al.
[33] for clusters identified as spheres containing a mean density 300 times that of the mean
matter density of the Universe, $300\,\bar\rho_m(z)$. The ratio of the non-Gaussian mass function to
the Gaussian one will be given as a series expansion, defined below. This factor will be 
a function of mass, redshift, and parameters that characterize the amplitude of the non- 
Gaussianity, which we define next. 

2.1 Parametrizing the level of non-Gaussianity 

Since object number counts are not sensitive to the details of the momentum space corre- 
lations, we consider the dimensionless, connected moments (the cumulants, divided by the 
appropriate power of the amplitude of fluctuations) of the density fluctuations smoothed on 
a given scale $R$, as defined in Eq.(1.2). Most constraints on non-Gaussianity have so far been
reported for a parameter that measures the size of the three-point correlation in momentum 
space, or bispectrum. This is an extremely useful first statistic because this correlation should 
be exactly zero if the fluctuations were exactly Gaussian. However, because the bispectrum 
is a function of two momenta, the non-Gaussian parameters most often quoted assume a 
shape for the bispectrum. 

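A generic homogeneous and isotropic bispectrum for the potential $\Phi$ can be written as

$$\langle\Phi(\mathbf{k}_1)\Phi(\mathbf{k}_2)\Phi(\mathbf{k}_3)\rangle = (2\pi)^3\,\delta_D^3(\mathbf{k}_1+\mathbf{k}_2+\mathbf{k}_3)\,B(k_1,k_2,k_3) \qquad (2.2)$$

where the function $B(k_1,k_2,k_3)$ determines the shape. Bispectra are colloquially named by
the (triangle) configuration of the three momentum vectors that are most strongly correlated.
To interpret our constraints on $\mathcal{M}_3$ in terms of familiar bispectra, we consider the templates
for 'local', 'equilateral' and 'orthogonal' bispectra:

$$B_{\rm local} = 2 f_{\rm NL}^{\rm local}\left[P(k_1)P(k_2) + P(k_1)P(k_3) + P(k_2)P(k_3)\right] \qquad (2.3)$$

$$B_{\rm equil} = 6 f_{\rm NL}^{\rm equil}\left[-P(k_1)P(k_2) + 2\ {\rm perm.} - 2\left(P(k_1)P(k_2)P(k_3)\right)^{2/3} + P(k_1)^{1/3}P(k_2)^{2/3}P(k_3) + 5\ {\rm perm.}\right]$$

$$B_{\rm orth} = 6 f_{\rm NL}^{\rm orth}\left[-3P(k_1)P(k_2) + 2\ {\rm perm.} - 8\left(P(k_1)P(k_2)P(k_3)\right)^{2/3} + 3P(k_1)^{1/3}P(k_2)^{2/3}P(k_3) + 5\ {\rm perm.}\right]$$

where the power spectrum, $P(k)$, is defined from the two-point correlation function by

$$\langle\Phi(\mathbf{k}_1)\Phi(\mathbf{k}_2)\rangle = (2\pi)^3\,\delta_D^3(\mathbf{k}_1+\mathbf{k}_2)\,P(k_1) = (2\pi)^3\,\delta_D^3(\mathbf{k}_1+\mathbf{k}_2)\,\frac{2\pi^2\Delta_\Phi^2(k_0)}{k_1^3}\left(\frac{k_1}{k_0}\right)^{n_s-1} \qquad (2.4)$$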



In the best fit cosmology from the seven-year WMAP data, baryon acoustic oscillations and
Hubble parameter measurements, the spectral index is $n_s = 0.967$ [34, 35], and the amplitude
is such that $\sigma_8 = 0.81$. Observationally, the parameter $f_{\rm NL}^{\rm local}$ is typically measured by looking
for a bispectrum of the form given in the first line of Eq.(2.3), which has weaker implications
than the definition of all the correlation functions as in Eq.(1.3). The best current limits
on the amplitudes of the bispectra in Eq.(2.3) come from the Planck Satellite maps of the
CMB [10], which limit $f_{\rm NL}^{\rm local} = 2.7 \pm 5.8$, $f_{\rm NL}^{\rm equil} = -42 \pm 75$, and $f_{\rm NL}^{\rm orth} = -25 \pm 39$ at the
68.3% confidence level. Table 1 shows the value of $\mathcal{M}_3$, smoothed on a scale corresponding
to 10^^ $h^{-1} M_\odot$ halos ($h$ is related to the Hubble parameter today, $H_0 = 100\,h$ km/s/Mpc),
for the local, equilateral, and orthogonal templates using the WMAP7 best fit cosmology.



Table 1. The conversions between the parameter $\mathcal{M}_3$ and the amplitudes of particular bispectra
$f_{\rm NL}$. These numbers assume the WMAP7 best fit cosmology and change by at most a few percent if
the best fit cosmologies from our analysis are used instead.

Shape          $\mathcal{M}_3$
Local          $0.00031\, f_{\rm NL}^{\rm local}$
Equilateral    $0.000086\, f_{\rm NL}^{\rm equil}$
Orthogonal     $-0.000062\, f_{\rm NL}^{\rm orth}$
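
The conversions in Table 1 amount to a single multiplicative factor per bispectrum shape. A minimal sketch of that bookkeeping follows; the coefficients are copied from Table 1, while the dictionary and function names are illustrative and the example input is arbitrary.

```python
# Conversion factors M_3 / f_NL copied from Table 1 (WMAP7 best fit cosmology).
M3_PER_FNL = {
    "local": 0.00031,
    "equilateral": 0.000086,
    "orthogonal": -0.000062,
}


def m3_from_fnl(f_nl, shape):
    """Dimensionless skewness implied by a bispectrum amplitude of the given shape."""
    return f_nl * M3_PER_FNL[shape]


def fnl_from_m3(m3, shape):
    """Bispectrum amplitude of the given shape implied by a skewness M_3."""
    return m3 / M3_PER_FNL[shape]


# Arbitrary example: translate one skewness value into each template amplitude.
example_m3 = -0.004
for shape in M3_PER_FNL:
    print(f"{shape:12s} f_NL = {fnl_from_m3(example_m3, shape):8.1f}")
```

Because the orthogonal coefficient is negative, a negative skewness maps to a positive orthogonal amplitude, and asymmetric upper and lower error bars swap roles under the conversion.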



For some non-Gaussian scenarios (notably the local ansatz and typical single field models)
the parameter $\mathcal{M}_3$ is interchangeable with $f_{\rm NL}^{\rm local}$, $f_{\rm NL}^{\rm equil}$, or $f_{\rm NL}^{\rm orth}$ as a description
of the amplitude of the three point function and as a measure of the total non-Gaussianity
for the entire series of higher order correlations. This is possible when the cumulants scale
parametrically as

$$\mathcal{M}_n^h \propto \left(\mathcal{I}\,\mathcal{P}_\zeta^{1/2}\right)^{n-2}, \qquad (2.5)$$

where $\mathcal{I}$ is proportional to the appropriate $f_{\rm NL}$ parameter. Although it is not needed for the
simplest models, we use this more general notation, since it is useful for the two-parameter
scenarios we introduce below. The superscript 'h' labels the scaling in Eq.(2.5), which we
call hierarchical.

Mathematically, the higher order correlations could be nearly arbitrary, and it is only
because we hope they have a common origin in some perturbation theory that it seems likely
they are related. In this paper, we will contrast a second possible scaling, occurring in models
where the scalar inflaton couples to a gauge field, that is much more non-Gaussian than the
hierarchical scenario for a fixed value of the skewness [28, 31]. This scaling, which we call
feeder (since it originates in models where fluctuations of a second field provide an extra
source for the inflaton fluctuations^3) is

$$\mathcal{M}_n^f \propto \mathcal{I}^n, \qquad n \geq 3. \qquad (2.6)$$

Object number counts are sensitive to the value of the total skewness and to the scaling of
higher moments, rather than any details of the momentum space correlations.

^3 The equation of motion for the inflaton fluctuations $\delta\varphi$ is $[\partial_t^2 - \nabla^2/a^2 + 3H\partial_t + m^2]\,\delta\varphi = J$, where $J$ is
a source that depends on the quantum fluctuations of fields coupled to the inflaton [31].

In addition to the dependence on a parameter like $f_{\rm NL}$, the cumulants also have numerical
coefficients that typically have to do with combinatorics. For example, beginning
with Eq.(1.3), the bispectrum contains three terms linear in $f_{\rm NL}^{\rm local}$, each with two equivalent
ways to take the expectation value of pairs of fields $\Phi_G$. We choose the constants
of proportionality equal to combinatoric factors for the moments that are generated in the
local ansatz and a simple two-field extension that gives feeder scaling:^4

Hierarchical: $$\mathcal{M}_n^h = n!\,2^{n-3}\left(\frac{\mathcal{M}_3^h}{6}\right)^{n-2} \qquad (2.7)$$

Feeder: $$\mathcal{M}_n^f = (n-1)!\,2^{n-1}\left(\frac{\mathcal{M}_3^f}{8}\right)^{n/3} \qquad (2.8)$$

For a given scaling of the moments, we can determine a series expansion for the probability
distribution and for the mass function that can be consistently truncated at some order in
the moments.

For the single parameter scenarios, we report constraints in terms of the scaling assumed
and the parameter $\mathcal{M}_3$, which can be compared with other constraints on particular
bispectrum shapes using Table 1.
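
To see how differently the two scalings populate the higher cumulants at fixed skewness, the sketch below evaluates $\mathcal{M}_n^h = n!\,2^{n-3}(\mathcal{M}_3/6)^{n-2}$ and $\mathcal{M}_n^f = (n-1)!\,2^{n-1}(\mathcal{M}_3/8)^{n/3}$, which is how the hierarchical and feeder prescriptions are read here. The function names are illustrative, and using a real, sign-preserving cube root for negative $\mathcal{M}_3$ in the feeder case is an assumption of this sketch rather than something stated in the text.

```python
import math


def hierarchical_cumulant(n, m3):
    """M_n for hierarchical scaling: n! 2^(n-3) (M_3/6)^(n-2)."""
    return math.factorial(n) * 2.0 ** (n - 3) * (m3 / 6.0) ** (n - 2)


def feeder_cumulant(n, m3):
    """M_n for feeder scaling: (n-1)! 2^(n-1) (M_3/8)^(n/3).

    Assumption: the fractional power is taken as the n-th power of the real,
    sign-preserving cube root so that negative M_3 stays real-valued.
    """
    x = m3 / 8.0
    cube_root = math.copysign(abs(x) ** (1.0 / 3.0), x)
    return math.factorial(n - 1) * 2.0 ** (n - 1) * cube_root ** n


m3 = 0.01  # illustrative skewness
for n in range(3, 7):
    print(n, f"{hierarchical_cumulant(n, m3):.3e}", f"{feeder_cumulant(n, m3):.3e}")
```

Both prescriptions return $\mathcal{M}_3$ itself at $n = 3$, but for $n \geq 4$ the feeder cumulants are larger by orders of magnitude at the same skewness, which is the sense in which that scenario is "much more non-Gaussian".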

2.2 The mass function in terms of $\mathcal{M}_3$ and the scaling of higher moments

We will assume the non-Gaussian factor in the mass function of Eq.(2.1) takes the following
form:

$$\left.\frac{n_{\rm NG}}{n_{\rm G}}\right|_{\rm Edgeworth} \approx 1 + \frac{F_1^{h,f}(M)}{F_0(M)} + \frac{F_2^{h,f}(M)}{F_0(M)} + \cdots \qquad (2.9)$$

Each term in the series is normalized by the Press-Schechter Gaussian term,
$F_0(M) = \frac{e^{-\nu_c^2/2}}{\sqrt{2\pi}}\,\frac{d\sigma}{dM}\,\frac{\nu_c}{\sigma}$, where $\nu_c = \delta_c/\sigma$, $\delta_c = 1.686$ is the collapse threshold, and $\sigma = \sigma(M)$
is the rms amplitude of density fluctuations smoothed on the appropriate scale (Eq.(A.4)).
Although the first term, $F_1^h(M)$ or $F_1^f(M)$, is proportional to $\mathcal{M}_3$ regardless of how the
higher moments scale, the exact form of all higher order terms depends on the choice of
scaling. For the hierarchical and feeder scaling, $F_n^h(M)$ and $F_n^f(M)$ are given in Eq.(A.14)
of the Appendix. Truncating this series after the first term is clearly unphysical since no
probability distribution with only a non-zero skewness can be positive everywhere. Although
for some objects (low mass, low redshift) this truncation does not cause a significant error,
for rarer fluctuations it does. Keeping higher terms in the series is therefore important. How
significant these terms are in the context of cluster constraints depends on the mass and
redshift of the objects as well as the amplitude and scaling of the non-Gaussianity considered.
In Section 5, we show several examples to illustrate how relevant the higher terms are as a
function of mass, redshift, skewness and scaling. Although this mass function has been shown
to agree reasonably well with simulations, it does not come from a first principles derivation.
In Section 5 we also contrast it to the Dalal et al. mass function from simulations of the local
ansatz [20].
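
The remark that a skewness-only truncation cannot be positive everywhere can be checked directly with the standard first-order Edgeworth (Gram-Charlier) form, in which the Gaussian PDF of $\nu$ is multiplied by $1 + (\mathcal{M}_3/6)\,H_3(\nu)$ with $H_3(\nu) = \nu^3 - 3\nu$. The sketch below is a generic illustration of that textbook expression, not the mass function of Appendix A; the function names, scan step and value of $\mathcal{M}_3$ are illustrative.

```python
import math


def skew_only_pdf(nu, m3):
    """First-order Edgeworth PDF: Gaussian times [1 + (M_3/6) H_3(nu)]."""
    gauss = math.exp(-0.5 * nu * nu) / math.sqrt(2.0 * math.pi)
    h3 = nu ** 3 - 3.0 * nu  # third (probabilists') Hermite polynomial
    return gauss * (1.0 + m3 / 6.0 * h3)


def first_negative_nu(m3, step=0.01, nu_max=20.0):
    """Scan away from nu = 0 on the side where the cubic correction is negative.

    Returns the first grid point where the truncated PDF is negative, or None.
    """
    side = -1.0 if m3 > 0 else 1.0
    nu = 0.0
    while nu <= nu_max:
        if skew_only_pdf(side * nu, m3) < 0.0:
            return side * nu
        nu += step
    return None


print(first_negative_nu(0.03))  # illustrative skewness
```

For positive skewness the truncated PDF turns negative on the underdense side, and for negative skewness it does so on the overdense side, which is the side that matters for rare, massive clusters.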



^4 The local ansatz was given in Eq.(1.3) with $f_{\rm NL}\langle\Phi_G(x)^2\rangle^{1/2} \ll 1$ to ensure weak non-Gaussianity. The
moments generated have the hierarchical scaling with $\mathcal{I} = f_{\rm NL}$. To obtain representative combinatorics for
the feeder scaling, we use a scenario where one Gaussian field and one subdominant but highly non-Gaussian
field contribute to the inhomogeneities in the gravitational potential: $\Phi(x) = \phi_G + \sigma_G + f_{\rm NL}[\sigma_G(x)^2 - \langle\sigma_G(x)^2\rangle]$,
with $f_{\rm NL}\mathcal{P}_\sigma^{1/2} > 1$. In that case $\mathcal{I} = f_{\rm NL}\mathcal{P}_\sigma/\mathcal{P}_\Phi^{1/2}$.



3 Two parameter extensions 

We also consider the ability of the data to constrain four models characterized by two parameters,
chosen to match classes of non-Gaussian inflation models. One very natural extension
is to introduce an additional parameter, $0 < q < 1$, so that the hierarchical moments of the
non-Gaussian density field behave as

$$\mathcal{M}_n^h = n!\,2^{n-3}\,q\left(\frac{\mathcal{M}_3^h}{6q}\right)^{n-2}. \qquad (3.1)$$

Moments of this type arise in scenarios when two fields, one Gaussian and the other with
weak local-type non-Gaussianity, contribute to the primordial fluctuations. In that case the
total primordial gravitational perturbation is

$$\Phi(x) = \phi_G(x) + \psi_G(x) + f_{\rm NL}\left(\psi_G^2(x) - \langle\psi_G^2\rangle\right). \qquad (3.2)$$

The ratio of the contribution of (the Gaussian part of) the field $\psi_G$ to the total power is the
parameter $q$:

$$q = \frac{\langle\psi_G^2\rangle}{\langle\phi_G^2\rangle + \langle\psi_G^2\rangle} \qquad (3.3)$$

and $\mathcal{I} = q f_{\rm NL}$. Since significant local non-Gaussianity only arises in multi-field models, this
model is quite plausible. In the standard local ansatz, only one of the terms in Eq.(3.2)
contributes to the fluctuations in the gravitational potential, but that field is distinct from
the field that sourced inflation. In other words, if $\phi_G$ represents the fluctuations of the
inflaton, the usual local ansatz corresponds to Eq.(3.1) with $q \to 1$. In these inflation
models, both fields must be very light compared to the Hubble scale during inflation, with
masses $m_{1,2} \ll H$.

We can similarly introduce a second parameter into the feeder scaling by defining

$$\mathcal{M}_n^f = q\,(n-1)!\,2^{n-1}\left(\frac{\mathcal{M}_3^f}{8q}\right)^{n/3} \qquad (3.4)$$

again with $0 < q < 1$. Such a coefficient appears in the inflation scenario of [31] and in the
'quasi-single field' models introduced by [36] where there is an additional heavy field relevant
for the fluctuations (with mass very close to the Hubble scale during inflation).

For either choice of scaling for the cumulants, we will take $\mathcal{M}_3$ and $q$ as the free parameters.
For fixed $\mathcal{M}_3$, smaller $q$ then corresponds to boosting the relative importance of higher
order cumulants, while $q = 1$ corresponds to our one-parameter models. For the scenarios
to remain weakly non-Gaussian, we need $\mathcal{M}_3/q \ll 1$. If our cosmology is consistent with
Gaussian primordial fluctuations, we expect the data to favor $q \approx 1$ and small $\mathcal{M}_3$.
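
The role of $q$ can be illustrated for the hierarchical case, reading the two-parameter moments as $\mathcal{M}_n^h = n!\,2^{n-3}\,q\,(\mathcal{M}_3/6q)^{n-2}$. This is a sketch under that assumed form; the function name and the numerical values are illustrative only.

```python
import math


def hierarchical_cumulant_q(n, m3, q=1.0):
    """Two-parameter hierarchical cumulant: n! 2^(n-3) q (M_3 / (6 q))^(n-2)."""
    if not 0.0 < q <= 1.0:
        raise ValueError("q must lie in (0, 1]")
    return math.factorial(n) * 2.0 ** (n - 3) * q * (m3 / (6.0 * q)) ** (n - 2)


m3 = 0.01  # illustrative skewness, held fixed while q varies
for q in (1.0, 0.5, 0.1):
    row = [hierarchical_cumulant_q(n, m3, q) for n in (3, 4, 5)]
    print(q, [f"{value:.3e}" for value in row])
```

At fixed $\mathcal{M}_3$ the $n = 3$ entry does not change with $q$, while the $n$-th cumulant grows as $q^{-(n-3)}$, so small $q$ boosts the higher cumulants, and $q = 1$ recovers the one-parameter expression.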

As another extension of the one-parameter models, we add scale-dependence. One advantage
of cluster number count constraints is the relatively small scale they probe compared
to the CMB. This suggests that clusters may be particularly useful when used in conjunction
with the CMB to constrain scale-dependent non-Gaussianity [30]. To that end, we consider
a scale-dependent ansatz with hierarchical scaling of the cumulants:

$$\mathcal{M}_{3,R} = 6\,\mathcal{I}_{R_0}\left(\frac{R}{R_0}\right)^{n_3}, \qquad \mathcal{M}_{n,R} = n!\,2^{n-3}\left(\frac{\mathcal{M}_{3,R}}{6}\right)^{n-2} \qquad (3.5)$$

where $n_3$ is the additional model parameter. We chose the pivot scale to be $R_0 = 8\,h^{-1}$
comoving Mpc. Similarly, we can add scale dependence to the feeder type scenarios:

$$\mathcal{M}_{3,R} = 8\,\mathcal{I}_{R_0}^3\left(\frac{R}{R_0}\right)^{n_3}, \qquad \mathcal{M}_{n,R} = (n-1)!\,2^{n-1}\left(\frac{\mathcal{M}_{3,R}}{8}\right)^{n/3} \qquad (3.6)$$

4 Application to galaxy cluster data 

We now investigate the constraints on these models from current galaxy cluster data. In this 
work, we employ the data and analysis of [3, 5], to which we refer the reader for complete 
details. 

Briefly, the cluster data set consists of 237 X-ray bright clusters detected in the ROSAT 
All-Sky Survey [4] and compiled in the BCS, REFLEX or bright MACS catalogs [37-39] sat- 
isfying conservative luminosity/flux thresholds such that the physical properties and selection 
function of the final sample are well understood.^5 These detections span the redshift range
$0 < z < 0.5$; in Figure 1 their distribution in redshift and mass is compared to SZ detections
that have been used in previous non-Gaussianity constraints. In addition to survey X-ray flux 
measurements and spectroscopic redshifts, archival ROSAT or Chandra data for 94 of the 
clusters were used to obtain mass proxies in the form of temperature and gas mass measure- 
ments. Total mass measurements from hydrostatic analysis of Chandra data for 42 relaxed 
clusters were also incorporated [40].^6 Together these data provide the ability to calibrate
the scaling and intrinsic scatter with mass of cluster X-ray luminosity, gas temperature and 
gas mass. This information is critical to accounting for survey selection effects and provides 
improved cosmological constraints generally compared with a pure self-calibration approach. 
Full details of our scaling relation model and the data used to constrain it are found in [3, 5]. 

Through a technique detailed in [40], measurements of the ratio of gas mass to total 
mass for the 42 relaxed clusters above additionally and independently provide constraints 
on the mean cosmic matter density and the expansion of the Universe, which we also take
advantage of here. We additionally incorporate seven-year WMAP constraints in some of 
the results below, using the publicly released WMAP data and analysis codes [35]. We do 
not model the effects of non-Gaussianity on the CMB power spectrum in this analysis, which 
is justified since even the current constraints limit any shift due to non-Gaussianity to be 
small; consequently, the WMAP data effectively only provide additional constraining power 
on the standard set of cosmological parameters, particularly $\sigma_8$.

Following the method outlined in Section 2, we evaluate the non-Gaussian mass function 
as the product of a Gaussian mass function and a series whose terms depend on the model 
under consideration (see Eq.(2.1) and Eq.(A.13)). As in [3, 5], the Gaussian mass function 
we use is the simulation-calibrated fit of [33] , with the cosmology-dependent correspondence 
of halo mass and $\sigma(M)$ provided by camb.^7 Note that, independent of any non-Gaussianity,
our treatment includes an allowance for systematic uncertainty in the normalization, shape,
and evolution of the Gaussian mass function in the form of a multidimensional prior on the
relevant parameters [3]. This prior translates to a 10% uncertainty on the Gaussian mass
function at the typical mass of our clusters (10^^ $M_\odot$). We evaluate the non-Gaussian series
given by Eq.(A.13) and Eq.(A.14), keeping at most 16 terms for the hierarchical model and
17 for the feeder. This is a large enough number of terms that the series expansion of the
PDF (estimated by the relative size of the first ignored term) is accurate to 20% or better
for the highest redshift, most massive clusters in the sample and $\mathcal{M}_3 < 0.04$ (0.025) for
hierarchical (feeder) scaling. However, for the low values of $\mathcal{M}_3$ that the data prefer, a
considerably smaller number of terms (around 5) is sufficient for the PDF to be accurate to a
few percent in the range of interest, and there is not much difference in the mass function as
the number of terms is changed. Furthermore, for larger levels of non-Gaussianity (outside
the observational bounds, but still weakly non-Gaussian) the series can be badly behaved if
too many terms are kept. See [32] for more detail.

^5 Since the original work, Abell 689 has been removed from the data set, since Chandra observations reveal
its X-ray emission to be dominated by a point source rather than the intracluster medium.

^6 Note that this sample of 42 only partly overlaps the larger cosmology sample, a fact which is accounted
for in the data analysis.

^7 http://www.camb.info

Figure 1. For illustrative purposes, the mass and redshift distribution of our cluster data [5]
are compared with South Pole Telescope detected clusters used to obtain constraints on $f_{\rm NL}^{\rm local}$ by
Williamson et al. [16] and Benson et al. [15]. The mass values assume a flat concordance model with
$h = 0.7$ and $\Omega_m = 0.3$.
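
The truncation logic described for the non-Gaussian series (judge accuracy by the relative size of the first ignored term, and avoid keeping so many terms that an asymptotic series starts to misbehave) can be sketched generically. The helper below and the toy series $\sum_k k!\,x^k$ are illustrative assumptions; they are not the Edgeworth terms of Eq.(A.13) and Eq.(A.14).

```python
import math


def truncate_series(terms, tolerance):
    """Sum terms until the first ignored term is small relative to the partial sum.

    Also stops once the next term is larger than the last kept one, the usual
    sign that an asymptotic series has passed its optimal truncation point.
    Returns (partial_sum, n_kept, relative_size_of_first_ignored_term, reason).
    """
    total = 0.0
    for i in range(len(terms) - 1):
        total += terms[i]
        ignored = abs(terms[i + 1])
        rel = ignored / abs(total) if total != 0.0 else math.inf
        if rel < tolerance:
            return total, i + 1, rel, "converged"
        if ignored > abs(terms[i]):
            return total, i + 1, rel, "terms growing"
    return total + terms[-1], len(terms), math.nan, "exhausted"


# Toy asymptotic series with terms k! x^k: they shrink at first, then grow.
x = 0.1
toy_terms = [math.factorial(k) * x ** k for k in range(17)]
print(truncate_series(toy_terms, 1e-2))  # loose tolerance: a few terms suffice
print(truncate_series(toy_terms, 1e-6))  # tight tolerance: stops when terms grow
```

With a loose tolerance only a handful of terms are needed, whereas demanding more accuracy than the series can deliver ends at the point where further terms would make the sum worse rather than better.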

Constraints on non-Gaussianity from clusters are limited by the precision with which 
cluster masses can be estimated, both individually (i.e. constraining the masses of the most 
rare objects) and statistically (constraining the mass function of the population). The former 
is straightforwardly related to the limitations of individual cluster measurements, while the 
latter also depends on the scaling and intrinsic scatter with mass of the survey observable 
used to define the cluster sample, as well as the ability of the data to constrain those nuisance 
parameters. In the present work, both our estimates of individual masses and the overall 
cluster mass scale include a systematic error budget of ~ 15%; these reflect allowances for a 
variety of uncertainties such as instrument calibration and departures from hydrostatic equi- 
librium [5, 40]. Including all these allowances, our data ultimately provide a 10% constraint 
on the normalization of the X-ray luminosity-mass relation, with the best fitting intrinsic 
scatter of that relation measured to be $(43 \pm 4)\%$ [5].

Figure 2. Marginalized joint constraints on $\mathcal{M}_3$ (or the corresponding $f_{\rm NL}^{\rm local}$) and $\sigma_8$ at 68.3%
and 95.4% confidence levels from cluster and clusters+CMB data, for single-parameter non-Gaussian
models. Only CMB power spectra (not bispectra) are used in the combination, tightening constraints
on the standard set of cosmological parameters, but providing no additional, direct constraining power
on non-Gaussianity. The constraints are consistent with Gaussianity in all cases, but their strength
and character depend on whether the hierarchical or feeder scaling (left and right panels) is used.

When analyzing cluster data, we vary the mean baryon and matter densities ($\Omega_b$ and
$\Omega_m$), the Hubble parameter ($h$) and the amplitude of the matter power spectrum ($\sigma_8$) in
addition to the parameters describing non-Gaussianity and cluster scaling relations, and a
number of nuisance parameters accounting for various systematic uncertainties (see [3, 40]).
Priors on the Hubble parameter, $h = 0.742 \pm 0.036$ [41], and on the baryon density from
Big Bang Nucleosynthesis (BBN), $\Omega_b h^2 = 0.0214 \pm 0.002$ [42], are included. When including
WMAP data, we additionally marginalize over the optical depth to reionization, the spectral
index of scalar fluctuations and its running with wavenumber, and the amplitude of small-scale
CMB fluctuations due to the Sunyaev-Zel'dovich effect; the Hubble parameter and BBN
priors are not used in this case. The full suite of our results is shown in Tables 2 and 3.
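
The two external priors quoted for the cluster-only analysis can be written as independent Gaussian terms in a log-prior. The sketch below assumes exactly that (Gaussian, uncorrelated, normalization constants dropped); the dictionary and function names are illustrative and this is not the analysis code of [3, 5].

```python
# (mean, standard deviation) for the priors quoted in the text.
PRIORS = {
    "h": (0.742, 0.036),           # Hubble parameter prior [41]
    "omega_b_h2": (0.0214, 0.002),  # BBN prior on the baryon density [42]
}


def log_prior(h, omega_b):
    """Sum of Gaussian log-prior terms, up to an additive constant."""
    values = {"h": h, "omega_b_h2": omega_b * h * h}
    total = 0.0
    for name, (mean, sd) in PRIORS.items():
        total += -0.5 * ((values[name] - mean) / sd) ** 2
    return total


# Example evaluation at an arbitrary point in parameter space.
print(log_prior(h=0.70, omega_b=0.045))
```

In a sampler this term would simply be added to the cluster log-likelihood; when CMB data are included the text states that these two priors are dropped, which corresponds to omitting the term.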

Marginalized joint constraints on $\mathcal{M}_3$ and $\sigma_8$ are shown in Figure 2 for the
single-parameter non-Gaussian model, using both hierarchical and feeder scalings. In both cases,
the principal degeneracy of $\mathcal{M}_3$ is with $\sigma_8$ when only cluster data are used. This is
intuitive, since $\sigma_8$ determines the relative rarity of massive clusters even in the purely
Gaussian case. With the addition of CMB data, $\sigma_8$ is independently tightly constrained, and for
both scalings the primary degeneracy of $\mathcal{M}_3$ is instead with the slope of the cluster X-ray
luminosity-mass relation, $\beta_{\ell m}$, as shown in Figure 3. This reflects the dependence of the
number of detections of massive clusters in an observable-limited survey on the relevant scaling
relation. Compared with the hierarchical scaling, the degeneracy between $\mathcal{M}_3$ and other model
parameters is generically weaker in the feeder model. This is due to the relatively larger
high-order non-Gaussian moments generated by the feeder scaling, whose effect is less easily
mimicked by changing other parameters (Figure 6; see also Figure 1 of [28]).
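
The role of $\sigma_8$ in setting the rarity of massive clusters can be illustrated with the Gaussian Press-Schechter collapsed fraction. The following is a minimal sketch rather than part of the analysis: the collapse threshold `DELTA_C = 1.686` and the two rms fluctuation values are assumed purely for illustration, and the function names are hypothetical.

```python
import math

DELTA_C = 1.686  # assumed spherical-collapse threshold (illustrative value)


def gaussian_collapsed_fraction(sigma, delta_c=DELTA_C):
    """Gaussian collapsed fraction, 2 * P(delta > delta_c) = erfc(nu_c / sqrt(2))."""
    nu_c = delta_c / sigma
    return math.erfc(nu_c / math.sqrt(2.0))


def tail_ratio(sigma, fractional_change):
    """Factor by which the collapsed fraction changes when sigma is rescaled."""
    boosted = gaussian_collapsed_fraction(sigma * (1.0 + fractional_change))
    return boosted / gaussian_collapsed_fraction(sigma)


if __name__ == "__main__":
    # Smaller sigma corresponds to larger smoothing masses, i.e. rarer objects.
    for sigma in (1.0, 0.5):
        print(sigma, tail_ratio(sigma, 0.05))
```

A 5% increase in the fluctuation amplitude changes the abundance of the rarer objects (smaller $\sigma$) by a larger factor, which is why cluster counts alone leave $\mathcal{M}_3$ and $\sigma_8$ degenerate.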

We also consider two-parameter non-Gaussian models for the two scalings, including
either the $q$ parameter or the scale-dependence parameter introduced in Section 3. In neither
case is the additional parameter well constrained by the data, although the impacts on the
$\mathcal{M}_3$ constraints are relatively minor (Table 2). However, we find that values $q \sim 0$,
corresponding to having strongly non-Gaussian fluctuations regardless of the amplitude of
$\mathcal{M}_3$, are disfavored (see Figure 4). The clusters+CMB data respectively provide 95.4%
confidence lower limits of $q > 0.10$ and $q > 0.18$ with the hierarchical and feeder scalings,
assuming a uniform prior from zero to one.

Figure 3. As in Figure 2, but showing constraints on $\mathcal{M}_3$ and $\beta_{\ell m}$, the power-law
slope of the cluster X-ray luminosity-mass relation.

Table 2. Marginalized best fitting values and 68.3% confidence intervals on the most interesting
fit parameters (see text), using cluster (CL) and CL+CMB data, for various models (choices of the
scaling and the number of free parameters). 'Skew-only' refers to truncating the series in Eq.(A.13)
after the second term, in which case there is no distinction between hierarchical (h) and feeder (f)
scalings. For the scale-dependent models, the amplitude of the skewness is reported at the pivot
point. (a) A small local maximum in probability in the range [−33, −28] is also formally included in
this maximum likelihood confidence region (see also Figure 5).

Table 3. The constraints on the skewness can be converted to constraints on the amplitude of any
bispectrum. The shape of the bispectrum is independent of the scaling, although the usual local
ansatz corresponds to a local-shape bispectrum with hierarchical moments.

Figure 4. Posterior probability density of the $q$ parameter for hierarchical and feeder scalings. In
our parametrization, small values of $q$ boost higher order non-Gaussian moments relative to
$\mathcal{M}_3$ and $q = 1$ corresponds to the one-parameter models with either scaling. The clusters+CMB
data disfavor very small values of $q$, consistent with the overall preference for Gaussian initial
fluctuations, although only the feeder scaling shows a clear (if modest) preference for $q = 1$. At
95.4% confidence, the lower limits on $q$ are respectively $q > 0.10$ and $q > 0.18$ for the
hierarchical and feeder scenarios.

For ease of comparison to the literature (Section 5), we also obtained constraints on
$\mathcal{M}_3$ keeping only the first term in the non-Gaussian mass function, proportional to the
skewness. As shown in Table 2, the resulting error bars are larger than when we include more
terms in the series, reflecting the fact that those terms generically increase the deviation of
the mass function from the Gaussian one at higher masses and higher redshifts (again, see
Figure 6). For example, keeping only the first term, our constraints from cluster and CMB
data correspond to the $f_{\rm NL}^{\rm local}$ interval listed for the skew-only model in Table 3. When
including all relevant terms we find $f_{\rm NL}^{\rm local} = -3^{+78}_{-91}$ for the hierarchical scaling
and $f_{\rm NL}^{\rm local} = -14^{+22}_{-21}$ for the feeder scaling.

In light of the favorable comparison of our results to those obtained previously (see Section 5),
we briefly investigate what characteristics of our data set influence the results. There
is a practical limitation, however, since any attempt to reduce the overall size of the sample,
the number of clusters with mass estimates from follow-up data, or the mass/redshift ranges
covered, necessarily impacts constraints on the full set of cosmological and scaling relation
parameters. Consequently, we confine ourselves to a single, limited, but informative comparison
by asking how our constraints change when data at $z > 0.3$ are excluded. In detail, this
low-redshift sample contains 203 clusters, of which 61 have follow-up data, compared to 237
and 94 for the full data set. As shown in Figure 5 for single-parameter non-Gaussian models
using the full hierarchical and feeder scalings, the constraining power of this low-redshift data
set is significantly reduced. Results for the low-redshift clusters only are shown in Table 2.

Figure 5. As in Figure 2, but comparing constraints obtained from the full cluster data set with
those from only clusters at redshifts $z < 0.3$.

5 Comparison with the literature

Previous forecasts for constraints on non-Gaussianity from cluster counts have been done by 
Pillepich et al. [7] for the eROSITA X-ray mission, Sartoris et al. [43] for future X-ray surveys 
resembling the Wide Field X-ray Telescope concept, Oguri [44] for a variety of future optical 
surveys, Cunha et al. [6] for optically selected clusters in the Dark Energy Survey (DES), 
and Mak and Pierpaoli [8] for future surveys using the Sunyaev-Zel'dovich effect. There have 
been three previous cluster constraints on non-Gaussianity: two based on clusters detected
in the SPT survey, by Benson et al. [15], who find $f_{\rm NL}^{\rm local} = -192 \pm 310$, and Williamson et al.
[16], who report $f_{\rm NL}^{\rm local} = 20 \pm 450$; and one based on the SDSS maxBCG cluster catalogue,
by Mana et al. [17], who have $f_{\rm NL}^{\rm local} = 282 \pm 317$.*

*This result corresponds to their analysis of only cluster number counts, without including either the cluster
power spectrum or CMB data.

The existing forecasts and constraints use a variety of prescriptions for the non-Gaussian
mass function (listed in Table 4). These mass functions are given in Eq.(A.19) in the Appendix
and are plotted in Figure 6. The levels of non-Gaussianity shown are $\mathcal{M}_3 = 0.009$
and $\mathcal{M}_3 = 0.031$, which correspond to the local model bispectrum with $f_{\rm NL}^{\rm local} = 30$ and 100,
respectively. Notice that, below about 10^^ $M_\odot$, the Dalal et al. mass function [20] deviates
a little less from the Gaussian for a given value of $f_{\rm NL}$ than the LoVerde et al. mass function
[30] (hereafter LMSV). However, some authors have found the LMSV mass function agrees
better with simulation results if a reduced collapse threshold, $\delta_c \sim 1.5$, is used. If that
adjustment is made, the Dalal et al. mass function would deviate more from the Gaussian than
LMSV; see [45] for a comparison of all these cases. Since the Dalal et al. mass function was
calibrated on simulations of the local ansatz, in principle it should include information about
higher moments. This technique, though, has only been tried against one set of simulations
and only for non-Gaussianity of the local type. A more precisely calibrated, more general
non-Gaussian mass function will be important for any future analysis of non-Gaussianity
with clusters.
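
The two quoted pairs ($\mathcal{M}_3 = 0.009$ for $f_{\rm NL}^{\rm local} = 30$, and $0.031$ for $100$) are consistent with a linear relation between the skewness and the bispectrum amplitude. The sketch below assumes exact proportionality and takes the coefficient from the second pair; the constant and function names are hypothetical, and the true coefficient depends on the bispectrum shape and on the smoothing scale.

```python
# Proportionality constant inferred from the quoted pair M3 = 0.031 <-> f_NL = 100.
# This is an assumption for illustration; it holds only for the local shape at the
# mass scale used in the text.
M3_PER_FNL_LOCAL = 0.031 / 100.0


def m3_from_fnl_local(fnl):
    """Dimensionless skewness implied by a local-shape f_NL (linear approximation)."""
    return M3_PER_FNL_LOCAL * fnl


def fnl_local_from_m3(m3):
    """Local-shape f_NL implied by a dimensionless skewness (linear approximation)."""
    return m3 / M3_PER_FNL_LOCAL


if __name__ == "__main__":
    print(m3_from_fnl_local(30.0))    # compare with the quoted 0.009
    print(fnl_local_from_m3(-0.001))  # roughly -3, the hierarchical central value
```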

Table 4. Gaussian mass functions and non-Gaussian extensions used in the literature. The
non-Gaussian mass functions are either the first order semi-analytic expression from LoVerde et al. [30]
(LMSV) or the mass function calibrated on N-body simulations of the local ansatz by Dalal et al.
[20]. All non-Gaussian mass functions also make use of a Gaussian mass function such as those fit by
Sheth and Tormen [46], Warren et al. [47], Jenkins et al. [48] or Tinker et al. [33, 49].

Author             Mass function used
Benson [15]        Jenkins + $f_{\rm Dalal}$
Cunha [6]          Jenkins + $f_{\rm Dalal}$
Mak [8]            Tinker + $f_{\rm LMSV}$, skew only
Mana [17]          Tinker + $f_{\rm LMSV}$, skew only
Oguri [44]         Warren + $f_{\rm LMSV}$, skew only
Pillepich [7]      Tinker + $f_{\rm LMSV}$, skew only
Sartoris [43]      Sheth-Tormen + $f_{\rm LMSV}$, skew only
Williamson [16]    Jenkins + $f_{\rm Dalal}$
This work          Tinker + $f_{\rm LMSV}$, many terms



Apart from the non-Gaussian mass function, these forecasts and analyses differ from 
one another and from ours in two principal ways: the form and complexity assumed for the 
mass-observable relation and its intrinsic scatter, and priors on the associated parameters. 
The most pessimistic forecasts in the literature find marginalized one sigma errors on $f_{\rm NL}^{\rm local}$
around O(10^) (e.g., some cases analyzed in [6, 7]). Those results assume that the scaling
relations will be constrained solely through self-calibration [50] rather than with estimates 
of cluster masses, which can significantly boost the constraining power [51]. In addition, 
some forecasts assume significant photometric redshift errors [7]. As outlined in Section 4, 
all of the clusters in our sample have spectroscopic redshifts and for nearly half we also have 
follow-up X-ray data that significantly improve the mass determinations. 

Among the SPT results, Benson et al. use a smaller area of the survey than Williamson
et al., but have an improved mass calibration and extend their sample to lower SZ detection
significance (i.e. lower mass). In comparison, our cluster data set is significantly larger than
either of the SPT cluster samples, contains more massive clusters (although at lower redshifts),
has a larger intrinsic scatter in the mass-observable relation (although the parameters
of the scaling relation are better constrained), and uses a more straightforward mass calibration
(i.e. directly incorporating X-ray mass measurements rather than calibrating mass via
simulation priors or priors based on external X-ray data). The masses and redshifts of these
three cluster samples are compared in Figure 1.

Figure 6. Comparing mass functions. Black (filled point-down triangles): LMSV with one term.
Blue (open point-down triangles): LMSV with hierarchical scaling, keeping 4 terms. Purple (open
point-up triangles): LMSV with feeder scaling, keeping seven terms. (The number of terms was chosen
to give good behavior up to $f_{\rm NL} = 300$ at redshift 1.) Red (squares): Dalal et al. + Tinker et al.
Note the difference in scale on the vertical axes in the top row compared to the bottom row.

Both of the SPT studies used CMB data to constrain the amplitude of the Gaussian
power spectrum, as well as other cosmological parameters, as we did in Section 4. Both also
used the Dalal et al. non-Gaussian mass function, so it is not immediately clear which of our
constraints is most comparable to theirs. However, comparing the various mass functions
for the masses and redshifts of the SPT clusters and non-Gaussianity of magnitude
$f_{\rm NL}^{\rm local} \sim \mathcal{O}(100)$, the better comparison seems to be with our skew-only results
(Figure 6). Our clusters+CMB constraint for the skew-only model (Table 3) is roughly a
factor of two tighter than the constraints of Benson et al., but less of an improvement than
a straightforward dependence on the cluster sample size (237 versus 18) would imply. The
precision of the mass calibration used in the two works is similar, and most likely limits the
improvement we see from the larger data set. The greater constraining power of the Benson
et al. analysis (18 clusters) versus that of Williamson et al. (26 clusters) further underscores
the impact of providing additional data to improve the cluster mass calibration. On the other
hand, the significant improvement in the constraining power of our full data set compared
with only the low-redshift portion (see Figure 5) indicates that our current constraints are
still not entirely systematically limited.
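
To make the sample-size comparison concrete, one can ask what improvement a purely statistical scaling would predict. The sketch below assumes that uncertainties shrink as $1/\sqrt{N}$ with the number of clusters $N$; this scaling is an illustrative assumption, not a statement from the analysis, and the function name is hypothetical.

```python
import math


def naive_improvement(n_large, n_small):
    """Tightening expected if errors scaled as 1/sqrt(N) (assumed Poisson-like scaling)."""
    return math.sqrt(n_large / n_small)


if __name__ == "__main__":
    # 237 clusters in this work versus 18 in Benson et al.
    print(naive_improvement(237, 18))
```

Under that assumption the expected gain is between three and four, noticeably larger than the factor of about two quoted above, in line with the interpretation that the mass calibration limits the improvement.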

Compared to both our data and the SPT samples, the maxBCG catalogue used by 
Mana et al. [17] contains a very large number (13 823) of mostly less massive clusters at low 
redshifts (0.1 < z < 0.3) [29]. Their mass calibration is accomplished through a stacked weak 
gravitational lensing analysis, which constrains the mean optical richness-mass relation but 
does not provide mass estimates for individual clusters [52, 53]. This feature, as well as the 
lack of high mass and high redshift clusters, presumably limits their constraining power. We 
note, however, that the large range in mass probed allows them to achieve from cluster counts 
alone constraints comparable to those of Benson et al. from clusters+CMB. Our skew-only,
clusters-only constraint (Table 3) is comparable to that of Mana et al.

There are some existing constraints on primordial non-Gaussianity beyond the skewness 
that are complementary to those we find here. For example, the CMB data from the WMAP 
satellite has been analyzed by [10, 54]. The Planck data has not yet been exhaustively 
analyzed for evidence of a trispectrum (a four-point correlation function), but the collabo- 
ration has reported the strongest bound to date for one of the simplest momentum-space 
trispectrum shapes. This shape is part of the local family of non-Gaussianity (it is the
four-point function generated by Eq.(1.3)), but its amplitude can be constrained independently
of the amplitude of the three-point function ($f_{\rm NL}^{\rm local}$) and is typically called $\tau_{\rm NL}$. Planck data
constrain the amplitude of this shape to be $\tau_{\rm NL} < 2800$ [10], which is much weaker than
the constraint implied by Planck's tight limits on $f_{\rm NL}^{\rm local}$ for a single parameter model with
hierarchical scaling. However, the weakness of this constraint is likely to be due in large part
to the remaining foreground and systematic effects present in and yet to be removed from the
current analysis. For comparison, our central value in the hierarchical case, $\mathcal{M}_3 = -0.001$,
corresponds to a model with $\tau_{\rm NL} \sim 9$ (using the fitting function of [55] at 10^^ $h^{-1} M_\odot$).
Our central value for the feeder scaling, which is much more non-Gaussian, corresponds to
$\tau_{\rm NL} \sim 12\,300$. By introducing the parameter $q$, we also tested models where the relationship
between the skewness and the kurtosis was relaxed, although the rest of the series was still 
determined in terms of the two parameters. The constraints on non-Gaussianity from Planck 
data using Minkowski functionals do incorporate information from the series of cumulants 
in a way similar to cluster number counts. These could be more directly compared to our 
results with some additional analysis. 

Recently Giannantonio et al. [14] reported a constraint on non-Gaussianity from a clustering
analysis of a broad ensemble of multiwavelength galaxy surveys. In that analysis,
$f_{\rm NL}^{\rm local}$ is degenerate with a trispectrum parameter $g_{\rm NL}$ (the amplitude of the trispectrum
generated by adding a term proportional to $\Phi^3$ to the local ansatz). For $f_{\rm NL}^{\rm local} = 0$ they
report $-4.5 \times 10^5 < g_{\rm NL} < 1.6 \times 10^5$ (95% C.L.) while if $f_{\rm NL}^{\rm local}$ is nonzero and positive
they find that $g_{\rm NL}$ can take somewhat more negative values. Our results, using the hierarchical
scaling appropriate for weakly non-Gaussian local models, are comparable. Again
using a fitting formula from [55] at W^'^h~^ Mq, our largest allowed magnitude for the 
kurtosis (lA^4l ~ 5 x 10~^ for the one-parameter model with approximate 95% bounds 
l-^sj ^ 60 X 10^'^) corresponds to \gNL\ ^ 9 x 10^. Allowing the two-parameter model with 
(7 ~ 0.1 relaxes this constraint to {qnlI ^ 9 x 10^. If the same trispectrum shape is considered 
with feeder scaling for the rest of the moments, our results imply {qnlI ^ 3 x 10"' for the 
single parameter model and IqnlI ^ 6 x 10^ with q = 0.18.

6 Conclusions

We have used the mass and redshift distribution of X-ray bright clusters to find that the
primordial inhomogeneities in the gravitational potential can be consistently described by
a Gaussian distribution to an accuracy of about one part in 10^ at scales around
0.1-0.5 $h\,{\rm Mpc}^{-1}$. Our constraints apply to any model of weak non-Gaussianity that has a
sufficiently regular ordering in the cumulants to be modeled by one of the expressions in
Section 2 or Section 3 (Equations (2.7), (2.8), (3.1), (3.4), (3.5) or (3.6)). In particular, for
non-Gaussianity described by the local ansatz (Eq.(1.3), which defines all cumulants for the
model), we find $f_{\rm NL}^{\rm local} = -3^{+78}_{-91}$ from cluster data combined with the WMAP 7-year data.

Our analysis differs from previous work in that we include higher order cumulants in 
the non-Gaussian mass function. This allows us to differentiate constraints on two different 
one-parameter models for the non-Gaussianity, characterized by the relevance of moments 
beyond the skewness. We also tested the sensitivity to several two-parameter models, but 
found that current data are not very sensitive to this extra level of detail. Our full set of 
results can be found in Table 2. Table 3 shows those results interpreted in terms of several 
popular models for the bispectrum. 

The Planck satellite data recently led to very tight constraints on any non-Gaussianity 
on a significant range of scales, but those bounds are still about two orders of magnitude 
above the minimal levels predicted by slow-roll inflation. Some combination of lower-redshift 
probes will be required to explore the rest of that parameter space. This is an extremely 
worthwhile pursuit, since higher order correlations (or their absence) are our only route to learn
more about the primordial era. The previously unexplored sensitivity of cluster data to higher 
order moments of non-Gaussianity shown by our results argues for revisiting the forecasts for 
future cluster surveys. Further bounds on non-Gaussianity remain as theoretically interesting 
as more precise measurements of the dark energy equation of state. Since the next stage 
of research on the primordial era will be dominated by large scale structure observations, 
it is crucial to understand the full potential of complementary information from cluster 
number counts together with CMB and large scale structure constraints on the bispectrum 
and trispectrum, and the halo bias. 

There are several ways our analysis could be extended. First, in order to consistently 
treat as large a family of non-Gaussian models as possible, we have used a simple, semi- 
analytic form for the non-Gaussian mass function. It would be preferable to use non-Gaussian 
mass functions calibrated on simulations, but those results do not yet exist for a sufficient 
variety of scenarios. However, simulations of a two-field local model that spans between the 
hierarchical scenario (with the extra parameter q) and the feeder behavior are in progress 
[56]. Once this simulation work is complete, we could revisit our analysis using the Dalal
et al. mass function for the local ansatz together with a comparable expression for models
whose higher moments are relatively more important. 

The cluster data sets that have been used to constrain non-Gaussianity so far differ 
qualitatively in several respects. Nevertheless, empirical comparison of their constraining 
power underscores that the precision of the overall mass calibration, the availability of mass 
estimates for individual clusters, and the mass and redshift ranges probed all have an important
role. In the very near term, combining the available survey data as well as gravitational
lensing data (e.g. [57, 58]) in a multi-wavelength analysis has significant potential.

A variety of upcoming survey data could also improve on the non-Gaussianity constraints
in complementary ways. Data from, for example, the optical-wavelength Dark Energy
Survey [59] and the upcoming eROSITA all-sky X-ray survey [60] will extend the mass
range of the cluster samples downward over a wide redshift range, and enable clustering 
analyses. Continued SZ surveys will be crucial for detecting the most massive, high redshift 
clusters which are most sensitive to non-Gaussianity. In addition, targeted X-ray follow-up 
of individual clusters will allow the more precise mass measurements that significantly aid 
the overall statistical power of cluster samples. All of this data is being collected to study 
important outstanding problems in cosmology, especially dark energy and neutrino mass, 
but is also well suited to further tests of non-Gaussianity to study inflation. Constraints
on the primordial fluctuations from cluster counts and clustering provide a complementary 
cross-check to the CMB and to other large scale structure probes, and will continue to be an 
important tool for cosmology.

A Non-Gaussian mass functions

To model the effect of non-Gaussianity on the cluster mass function, we use an extended
Press-Schechter approach [61], following previous work in [30, 55, 62-64]. The non-Gaussian
probability distribution of (normalized) density fluctuations smoothed on a scale $R$ (associated
to halos of mass $M$ by $R = (3M/4\pi\bar\rho)^{1/3}$) is $P(\nu, M)$. Here $\nu = \delta/\sigma_R$ (with
$\sigma_R^2$ being the smoothed variance). The fraction of volume in collapsed objects (halos) is

$$F(M) = 2 \int_{\delta_c/\sigma_R}^{\infty} d\nu\, P(\nu, M) \tag{A.1}$$

with $\delta_c$ the threshold for collapse, and where we have included the Press-Schechter factor of
2 in front of the integral. Then the number density $dn(M)/dM$ of halos with masses between
$(M, M + dM)$ is

$$\frac{dn}{dM}(M, z) = -\frac{\bar\rho}{M}\,\frac{dF}{dM}(M, z) \tag{A.2}$$

where $\bar\rho = \Omega_m \rho_{\rm crit}$ is the average (comoving) matter density. The smoothed density field is
given by

$$\delta_R(z) = \int \frac{d^3k}{(2\pi)^3}\, W_R(k)\, \delta(\mathbf{k}, z) \tag{A.3}$$

where $W_R(k)$ is the Fourier transform of a window function, which we take to be a top-hat
in real space. The smoothed variance is

$$\sigma^2(M, z) = \frac{1}{2\pi^2} \int dk\, P_{\rm lin}(k, z)\, W^2(k, M)\, k^2 . \tag{A.4}$$
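
As a minimal numerical sketch of the smoothing defined above, the code below evaluates the mass-radius relation, the Fourier transform of a real-space top-hat, and the variance integral of Eq.(A.4). The power spectrum `toy_power` is a hypothetical smooth function used only to exercise the integral (it is not the linear spectrum used in the analysis), and the integration limits and step count are arbitrary choices.

```python
import math


def lagrangian_radius(mass, rho_bar):
    """Radius of the sphere enclosing `mass` at mean density `rho_bar`."""
    return (3.0 * mass / (4.0 * math.pi * rho_bar)) ** (1.0 / 3.0)


def tophat_window(k, radius):
    """Fourier transform of a real-space top-hat: 3 (sin x - x cos x) / x^3, x = kR."""
    x = k * radius
    if abs(x) < 1e-3:
        return 1.0 - x * x / 10.0  # small-x series, avoids cancellation error
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3


def smoothed_variance(power, radius, k_min=1e-4, k_max=1e2, n_steps=4000):
    """sigma^2 = (1 / 2 pi^2) * integral dk k^2 P(k) W^2(kR), trapezoid rule in ln k."""
    dlnk = math.log(k_max / k_min) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        k = k_min * math.exp(i * dlnk)
        weight = 0.5 if i in (0, n_steps) else 1.0
        # dk k^2 = dlnk k^3 after the change of variables to ln k
        total += weight * k ** 3 * power(k) * tophat_window(k, radius) ** 2
    return total * dlnk / (2.0 * math.pi ** 2)


def toy_power(k):
    """Hypothetical smooth spectrum, for illustration only."""
    return k * math.exp(-k)


if __name__ == "__main__":
    for radius in (1.0, 2.0):
        print(radius, smoothed_variance(toy_power, radius))
```

Larger smoothing radii (larger masses) give a smaller variance, which is what makes the most massive halos the rarest.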

The matter perturbations $\delta$ at redshift $z$ are related to the perturbations in the early matter
era potential $\Phi$ by

$$\delta(k, z) = M(k, z)\,\Phi(k) \tag{A.5}$$

$$M(k, z) = \frac{2}{3}\,\frac{1}{\Omega_m H_0^2}\, D(z)\, \frac{g(0)}{g(\infty)}\, T(k)\, k^2 .$$

Here $\Omega_m$ is the matter density relative to critical, $H_0$ is the Hubble constant, $D(z)$ is the
linear growth function at redshift $z$ normalized to one today, $T(k)$ is the transfer function, and
the growth suppression factor is $g(z = 0)/g(z = \infty) \simeq 0.76$ in the best-fit $\Lambda$CDM model.
The variance of density fluctuations at redshift $z$ smoothed on a scale $R$ associated to mass $M$
is $\sigma^2(M, z)$, defined by

$$\sigma^2(M, z) = \int_0^{\infty} \frac{dk}{k}\, W_R(k)^2\, M(k, z)^2\, \Delta_\Phi^2(k) , \tag{A.6}$$

where $\Delta_\Phi^2(k)$ is the amplitude of fluctuations, related to the power spectrum by
$P_\Phi(k) = (2\pi^2/k^3)\,\Delta_\Phi^2(k)$, with $\Delta_\Phi^2(k) \propto (k/k_0)^{n_s - 1}$.

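Petrov [65, 66] (references are the English translations) developed an asymptotic expansion
for non-Gaussian PDFs, which is a generalization of the Edgeworth expansion. In
terms of the dimensionless smoothed moments $\mathcal{M}_{n,R}$, this is

$$P(\nu, R)\, d\nu = \frac{d\nu}{\sqrt{2\pi}}\, e^{-\nu^2/2} \left\{ 1 + \sum_{s=1}^{\infty} \sum_{\{k_m\}} H_{s+2r}(\nu) \prod_{m=1}^{s} \frac{1}{k_m!} \left( \frac{\mathcal{M}_{m+2,R}}{(m+2)!} \right)^{k_m} \right\} \tag{A.7}$$

Here $H_n(\nu)$ are Hermite polynomials defined by $H_n(\nu) = (-1)^n e^{\nu^2/2} \frac{d^n}{d\nu^n} e^{-\nu^2/2}$ and
$r = k_1 + k_2 + \cdots + k_n$, where the set $\{k_m\}$ is built of all non-negative integer solutions of

$$k_1 + 2k_2 + \cdots + n k_n = s . \tag{A.8}$$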

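We use the two scalings, hierarchical and feeder, from Eq.(2.7) or Eq.(2.8), to organize the
terms in this series. Then we can write

$$F(M) = F_0(M) + F_1(M) + F_2(M) + \ldots \tag{A.9}$$

where, with $\nu_c = \delta_c/\sigma_R$,

$$F_0(M) = \frac{1}{2}\,{\rm Erfc}\left( \frac{\nu_c}{\sqrt{2}} \right) \tag{A.10}$$

and the rest of the series is ordered according to the scaling:

$$\sum_{s=1}^{\infty} F^h_s(M) = \frac{e^{-\nu_c^2/2}}{\sqrt{2\pi}} \sum_{s=1}^{\infty} \sum_{\{k_m\}_h} H_{s+2r-1}(\nu_c) \prod_{m=1}^{s} \frac{1}{k_m!} \left( \frac{\mathcal{M}_{m+2,R}}{(m+2)!} \right)^{k_m}$$

$$\sum_{s=1}^{\infty} F^f_s(M) = \frac{e^{-\nu_c^2/2}}{\sqrt{2\pi}} \sum_{s=1}^{\infty} \sum_{\{k_m\}_f} H_{s+1}(\nu_c) \prod_{m=1}^{s} \frac{1}{k_m!} \left( \frac{\mathcal{M}_{m+2,R}}{(m+2)!} \right)^{k_m} \tag{A.11}$$

The sets $\{k_m\}_h$ are again non-negative integer solutions to $k_1 + 2k_2 + \cdots + s k_s = s$ and
$r = k_1 + k_2 + \cdots + k_s$, but the sets $\{k_m\}_f$ are solutions to $3k_1 + 4k_2 + \cdots + (s+2)k_s = s + 2$.
Now for either scaling, truncating the series at some finite $s$ in the sums above keeps all terms
up to the same order in $\mathcal{M}_3$: $\mathcal{M}_3^{s}$ for hierarchical scalings and $\mathcal{M}_3^{(s+2)/3}$ for feeder scalings.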
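
The index sets entering each order can be enumerated directly. The sketch below lists the non-negative integer solutions for the hierarchical constraint ($k_1 + 2k_2 + \cdots + s k_s = s$) and the feeder constraint ($3k_1 + 4k_2 + \cdots + (s+2)k_s = s+2$) stated above; the function names are hypothetical and the brute-force recursion is chosen for clarity rather than speed.

```python
def weighted_solutions(weights, total):
    """All non-negative integer tuples k with sum(w_m * k_m) == total."""
    if not weights:
        return [()] if total == 0 else []
    first, rest = weights[0], weights[1:]
    solutions = []
    for k in range(total // first + 1):
        for tail in weighted_solutions(rest, total - first * k):
            solutions.append((k,) + tail)
    return solutions


def hierarchical_sets(s):
    """Sets {k_m}_h: solutions of k_1 + 2 k_2 + ... + s k_s = s."""
    return weighted_solutions(list(range(1, s + 1)), s)


def feeder_sets(s):
    """Sets {k_m}_f: solutions of 3 k_1 + 4 k_2 + ... + (s+2) k_s = s + 2."""
    return weighted_solutions(list(range(3, s + 3)), s + 2)


def cumulant_orders(ks):
    """Translate a set (k_1, k_2, ...) into the cumulants M_{m+2} it multiplies."""
    return {m + 2: k for m, k in enumerate(ks, start=1) if k > 0}


if __name__ == "__main__":
    for s in (1, 2, 4):
        print(s, [cumulant_orders(k) for k in hierarchical_sets(s)],
              [cumulant_orders(k) for k in feeder_sets(s)])
```

For example, at $s = 2$ the hierarchical sets combine $\mathcal{M}_3^2$ with $\mathcal{M}_4$, while at $s = 4$ the feeder sets combine $\mathcal{M}_3^2$ with $\mathcal{M}_6$, reflecting which cumulants are of the same order under each scaling.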

To write the mass function we will need derivatives of all the terms in the expansion
with respect to mass (or smoothing scale). In general, the derivatives can be found using the
relationship for the Hermite polynomials:

$$\nu H_n(\nu) - \frac{dH_n(\nu)}{d\nu} = H_{n+1}(\nu) . \tag{A.12}$$
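
The relation in Eq.(A.12) can also be used to generate the polynomials. The sketch below, a minimal illustration with hypothetical function names, stores each polynomial as a list of coefficients (lowest power first) and applies $H_{n+1} = \nu H_n - dH_n/d\nu$ repeatedly, starting from $H_0 = 1$.

```python
def next_hermite(coeffs):
    """Apply H_{n+1} = nu * H_n - dH_n/dnu to coefficients listed lowest power first."""
    times_nu = [0.0] + list(coeffs)                  # multiply by nu
    deriv = [i * c for i, c in enumerate(coeffs)][1:]  # differentiate
    deriv += [0.0] * (len(times_nu) - len(deriv))      # pad to equal length
    return [a - b for a, b in zip(times_nu, deriv)]


def hermite(n):
    """Coefficients of the Hermite polynomial H_n in the convention of the text."""
    coeffs = [1.0]  # H_0 = 1
    for _ in range(n):
        coeffs = next_hermite(coeffs)
    return coeffs


def evaluate(coeffs, nu):
    """Evaluate a polynomial (lowest power first) at nu using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * nu + c
    return result


if __name__ == "__main__":
    print(hermite(3))  # coefficients of nu^3 - 3 nu
    print(hermite(4))  # coefficients of nu^4 - 6 nu^2 + 3
```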



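The ratio of the non-Gaussian Edgeworth mass function to the Gaussian has the same structural
form for either scaling:

$$\left. \frac{n_{\rm NG}}{n_{\rm G}} \right|_{\rm Edgeworth} \approx 1 + \frac{F_1'^{\,h,f}(M)}{F_0'(M)} + \frac{F_2'^{\,h,f}(M)}{F_0'(M)} + \ldots \tag{A.13}$$

with the derivatives of each term $F_s' = dF_s/dM$ for $s \geq 1$:

$$F_s'^{\,h} = F_0' \sum_{\{k_m\}_h} \left\{ H_{s+2r}(\nu_c) \prod_{m=1}^{s} \frac{1}{k_m!} \left( \frac{\mathcal{M}_{m+2,R}}{(m+2)!} \right)^{k_m} + H_{s+2r-1}(\nu_c)\, \frac{\sigma}{\nu_c}\, \frac{d}{d\sigma} \left[ \prod_{m=1}^{s} \frac{1}{k_m!} \left( \frac{\mathcal{M}_{m+2,R}}{(m+2)!} \right)^{k_m} \right] \right\}$$

$$F_s'^{\,f} = F_0' \sum_{\{k_m\}_f} \left\{ H_{s+2}(\nu_c) \prod_{m=1}^{s} \frac{1}{k_m!} \left( \frac{\mathcal{M}_{m+2,R}}{(m+2)!} \right)^{k_m} + H_{s+1}(\nu_c)\, \frac{\sigma}{\nu_c}\, \frac{d}{d\sigma} \left[ \prod_{m=1}^{s} \frac{1}{k_m!} \left( \frac{\mathcal{M}_{m+2,R}}{(m+2)!} \right)^{k_m} \right] \right\} \tag{A.14}$$

where the $\{k_m\}$ again satisfy the relationships given below Eq.(A.11) and we have used

$$F_0' = \frac{e^{-\nu_c^2/2}}{\sqrt{2\pi}}\, \frac{d\sigma}{dM}\, \frac{\nu_c}{\sigma} . \tag{A.15}$$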



We have shared computer code to evaluate series for these two scalings at
http://www.slac.stanford.edu/~amantz/work/nongauss13/.

A.1 Other non-Gaussian mass functions

The mass function above has been shown to agree relatively well with N-body simulations
of the local ansatz [45, 55], although there is some disagreement in the literature about just
how well it agrees [67]. Some work has found evidence of a better fit if one shifts the collapse
threshold [68], but our data are not sensitive to that level of detail. Ideally, one would like
to use mass functions explicitly calibrated on simulations. To that end, Dalal et al. [20]
proposed the following

$$\frac{dn}{dM} = \int dM_G\, \frac{dn}{dM_G}\, \frac{dP}{dM}(M_G) \tag{A.16}$$

where $dn/dM_G$ is the Gaussian mass function (which Dalal et al. took to be Jenkins et al.
[48]) and $dP/dM(M_G)$ is the probability distribution describing how a Gaussian halo of mass
$M_G$ maps into a non-Gaussian halo with mass $M$. Dalal et al. gave fitting formulae for this
as a Gaussian distribution with:

$$\left\langle \frac{M}{M_G} \right\rangle - 1 = 1.3 \times 10^{-4}\, f_{\rm NL}\, \sigma_8\, \sigma(M_G, z)^{-2}$$

$${\rm var}\left( \frac{M}{M_G} \right) = 1.4 \times 10^{-4}\, (f_{\rm NL}\, \sigma_8)^{0.8}\, \sigma(M_G, z)^{-1} \tag{A.17}$$

For a fixed value of $f_{\rm NL}$, this mass function deviates less from the Gaussian than the LMSV
mass function does. This is an interesting approach, but has not been tried for non-local
non-Gaussianity.

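For reference, we collect here the various non-Gaussian mass functions that exist in the
literature. The mass function can be parametrized as

$$\frac{dn}{dM}(M, z) = f(\sigma)\, \frac{\bar\rho}{M}\, \frac{d\ln[\sigma^{-1}(M, z)]}{dM} \tag{A.18}$$

where $f(\sigma)$ will differ between various models. We compile here $f(\sigma)$ for the
(Gaussian) Tinker mass function [33] ($f_{T,\Delta=300}$), the LMSV mass function [30] truncated
at first order ($f_{\rm LMSV,1}$), the Dalal et al. [20] mass function ($f_{\rm Dalal}$), and the LMSV mass
function keeping a large number ($n$) of terms with either hierarchical ($f^h_{{\rm LMSV},n}$) or feeder
($f^f_{{\rm LMSV},n}$) scaling for the cumulants:

$$f_{T,\Delta=300}(\sigma) = \left. A \left[ \left( \frac{\sigma}{b} \right)^{-a} + 1 \right] e^{-c/\sigma^2} \right|_{\Delta=300}$$

$$f_{\rm LMSV,1} = f_{T,\Delta=300} \left[ 1 + \frac{F_1'(M)}{F_0'(M)} \right] = f_{T,\Delta=300} \left[ 1 + \frac{\mathcal{M}_3}{3!} H_3(\nu_c) + \frac{H_2(\nu_c)}{3!}\, \frac{\sigma}{\nu_c}\, \frac{d\mathcal{M}_3}{d\sigma} \right]$$

$$f_{\rm Dalal}(\sigma, M) = \left[ \frac{d\ln[\sigma^{-1}(M, z)]}{dM} \right]^{-1} \int dM_G\, f_{T,\Delta=300}(\sigma_{M_G})\, \frac{M}{M_G}\, \frac{d\ln[\sigma^{-1}(M_G, z)]}{dM_G}\, \frac{1}{M_G \sqrt{2\pi\, {\rm var}(M/M_G)}} \exp\left[ -\frac{\left( \frac{M}{M_G} - \left\langle \frac{M}{M_G} \right\rangle \right)^2}{2\, {\rm var}(M/M_G)} \right]$$

$$f^{h,f}_{{\rm LMSV},n} = f_{T,\Delta=300} \left[ 1 + \sum_{s=1}^{n} \frac{F_s'^{\,h,f}}{F_0'} \right] \tag{A.19}$$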



In general, anywhere we have written $f_{T,\Delta=300}$ in the non-Gaussian mass functions, one can
substitute any other Gaussian model. For models with hierarchical scaling, it is common to
use the skewness parameter $S_3 = \mathcal{M}_3/\sigma$. With that definition, it is clear that the second
line is equivalent to the expression used for the LMSV mass function in [7, 8],

$$f_{\rm LMSV,1} = f_{T,\Delta=300} \left[ 1 + \frac{1}{6}\, \frac{\sigma^2}{\delta_c} \left( S_3 \left( \frac{\delta_c^4}{\sigma^4} - 2\frac{\delta_c^2}{\sigma^2} - 1 \right) + \frac{dS_3}{d\ln\sigma} \left( \frac{\delta_c^2}{\sigma^2} - 1 \right) \right) \right] \tag{A.20}$$
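
The equivalence between the first-order correction written with $\mathcal{M}_3$ and with $S_3 = \mathcal{M}_3/\sigma$ can be checked numerically. The sketch below assumes the two forms $1 + \frac{\mathcal{M}_3}{6} H_3(\nu_c) + \frac{H_2(\nu_c)}{6}\frac{\sigma}{\nu_c}\frac{d\mathcal{M}_3}{d\sigma}$ and $1 + \frac{1}{6}\frac{\sigma^2}{\delta_c}\left[S_3\left(\frac{\delta_c^4}{\sigma^4} - 2\frac{\delta_c^2}{\sigma^2} - 1\right) + \frac{dS_3}{d\ln\sigma}\left(\frac{\delta_c^2}{\sigma^2} - 1\right)\right]$, with $\nu_c = \delta_c/\sigma$; the input values and function names are hypothetical and chosen only to exercise the algebra.

```python
def h2(nu):
    """Hermite polynomial H_2 in the convention of the text."""
    return nu ** 2 - 1.0


def h3(nu):
    """Hermite polynomial H_3 in the convention of the text."""
    return nu ** 3 - 3.0 * nu


def ratio_m3_form(sigma, m3, dm3_dsigma, delta_c=1.686):
    """First-order mass function ratio written with the dimensionless skewness M3."""
    nu = delta_c / sigma
    return 1.0 + m3 / 6.0 * h3(nu) + h2(nu) / 6.0 * (sigma / nu) * dm3_dsigma


def ratio_s3_form(sigma, m3, dm3_dsigma, delta_c=1.686):
    """The same ratio written with S3 = M3 / sigma and dS3/dln(sigma)."""
    s3 = m3 / sigma
    ds3_dlnsigma = dm3_dsigma - m3 / sigma  # sigma * d(M3 / sigma) / dsigma
    x2 = (delta_c / sigma) ** 2
    bracket = s3 * (x2 ** 2 - 2.0 * x2 - 1.0) + ds3_dlnsigma * (x2 - 1.0)
    return 1.0 + sigma ** 2 / (6.0 * delta_c) * bracket


if __name__ == "__main__":
    # Hypothetical inputs: sigma, M3 and dM3/dsigma are illustrative only.
    print(ratio_m3_form(0.6, 0.03, 0.01), ratio_s3_form(0.6, 0.03, 0.01))
```

The two functions agree to rounding error for any inputs, since the $S_3$ form reduces algebraically to the $\mathcal{M}_3$ form once $S_3 = \mathcal{M}_3/\sigma$ is substituted.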